The 1 Algorithm

Author

  • S. Sutton
Abstract

Appropriate bias is widely viewed as the key to efficient learning and generalization. I present a new algorithm, the Incremental Delta-Bar-Delta (IDBD) algorithm, for the learning of appropriate biases based on previous learning experience. The IDBD algorithm is developed for the case of a simple, linear learning system: the LMS or delta rule with a separate learning-rate parameter for each input. The IDBD algorithm adjusts the learning-rate parameters, which are an important form of bias for this system. Because bias in this approach is adapted based on previous learning experience, the appropriate testbeds are drifting or non-stationary learning tasks. For particular tasks of this type, I show that the IDBD algorithm performs better than ordinary LMS and in fact finds the optimal learning rates. The IDBD algorithm extends and improves over prior work by Jacobs and by me in that it is fully incremental and has only a single free parameter. This paper also extends previous work by presenting a derivation of the IDBD algorithm as gradient descent in the space of learning-rate parameters. Finally, I offer a novel interpretation of the IDBD algorithm as an incremental form of hold-one-out cross validation.

People can learn very rapidly and generalize extremely accurately. Information-theoretic arguments suggest that their inferences are too rapid to be justified by the data that they have immediately available to them. People can learn as fast as they do only because they bring to the learning situation a set of appropriate biases which direct them to prefer certain hypotheses over others. To approach human performance, machine learning systems will also need an appropriate set of biases. Where are these to come from? Although this problem is well known, there are few general answers. The field of pattern recognition has long known about the importance of feature selection, and the importance of representations is a recurring theme in AI.
But in both of these cases the focus has always been on designing in a good bias, not on acquiring one automatically. This has resulted in an accumulation of specialized and non-extensible techniques. Is there an alternative? If bias is what a learner brings to a learning problem, then how could the learner itself generate an appropriate bias? The only way is to generate the bias from previous learning experience (e.g., Rendell, Seshu, & Tcheng 1987). And this is possible only if the learner encounters a series of different problems requiring the same or similar biases. I believe that is a correct characterization of the learning task facing people and real-world learning machines.

In this paper, I present a new algorithm for learning appropriate biases for a linear learning system based on previous learning experience. The new algorithm is an extension of the Delta-Bar-Delta algorithm (Jacobs 1988; Sutton 1982; Barto & Sutton 1981; Kesten 1958) such that it is applicable to incremental tasks, that is, supervised learning tasks in which the examples are processed one by one and then discarded. Accordingly, I call the new algorithm the Incremental Delta-Bar-Delta (IDBD) algorithm. The IDBD algorithm can be used to accelerate learning even on single problems, and that is the primary way in which its predecessors have been justified (e.g., Jacobs 1988; Silva & Almeida 1990; Lee & Lippmann 1990; Sutton 1986), but its greatest significance, I believe, is for nonstationary tasks or for sequences of related tasks, and it is on the former that I test it here.
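The text above describes IDBD only in words: an LMS (delta rule) learner with one learning rate per input, where the learning rates themselves are adjusted by gradient descent and a single free parameter controls that adjustment. The following is a minimal Python sketch of one such update step, written in the form in which the IDBD update is commonly stated; the equations are not given in this excerpt, so treat them as an assumption to check against the full paper. The names `idbd_step`, `run_drifting_demo`, `theta` (the meta step size), `beta` (log learning rates), and `h` (a per-input memory trace), as well as every numeric setting in the demo, are illustrative choices and not values taken from the text.

```python
import math
import random


def idbd_step(w, beta, h, x, target, theta):
    """Apply one incremental IDBD update in place and return the error.

    w     : weights of the linear unit
    beta  : log of the per-input learning rates (alpha_i = exp(beta_i))
    h     : per-input trace of recent weight changes
    theta : meta step size, the single free parameter (name is an assumption)
    """
    # Prediction error of the linear unit (the LMS / delta-rule error).
    delta = target - sum(wi * xi for wi, xi in zip(w, x))
    for i, xi in enumerate(x):
        # Gradient step in log-learning-rate space: the rate grows when the
        # current update direction agrees with the recent trace h[i].
        beta[i] += theta * delta * xi * h[i]
        alpha = math.exp(beta[i])
        # Ordinary delta-rule weight update with a per-input learning rate.
        w[i] += alpha * delta * xi
        # Decay the trace (decay factor clipped at zero), then add the new change.
        h[i] = h[i] * max(0.0, 1.0 - alpha * xi * xi) + alpha * delta * xi
    return delta


def run_drifting_demo(steps=2000, n_relevant=5, n_irrelevant=15,
                      theta=0.001, seed=0):
    """Run IDBD on a toy drifting task; all settings here are assumptions."""
    rng = random.Random(seed)
    n = n_relevant + n_irrelevant
    # Only the first n_relevant inputs affect the target.
    true_w = [rng.choice((-1.0, 1.0)) for _ in range(n_relevant)]
    true_w += [0.0] * n_irrelevant
    w = [0.0] * n
    beta = [math.log(0.05)] * n
    h = [0.0] * n
    for t in range(steps):
        if t % 20 == 0:
            # Drift: flip the sign of one relevant target weight.
            k = rng.randrange(n_relevant)
            true_w[k] = -true_w[k]
        x = [rng.gauss(0.0, 1.0) for _ in range(n)]
        target = sum(tw * xi for tw, xi in zip(true_w, x))
        idbd_step(w, beta, h, x, target, theta)
    # Return the learned per-input learning rates.
    return [math.exp(b) for b in beta]
```

Calling `run_drifting_demo()` returns the per-input learning rates after training, which can be inspected to see whether inputs that matter for the drifting target end up with different rates from those that do not. The example is processed one example at a time and stores nothing but `w`, `beta`, and `h`, which is what "fully incremental" refers to in the text.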



Publication date: 1999